Support for BMC cards based on Aspeed 2720 - Phase 2#26002
chander-nexthop wants to merge 6 commits into sonic-net:master
files
- We don't intend to use obmc-console, so remove it from the platform packages
- Removed the remnant setup.py
Signed-off-by: Chandrasekaran Swaminathan <chander@nexthop.ai>
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
Signed-off-by: Chandrasekaran Swaminathan <chander@nexthop.ai>
systemd services. This way, the platform-specific changes made to the build_debian script for this can be removed.
2. Fix the watchdog logic. Add a new systemd service that pets the watchdog every 60 seconds (after arming it for 180 seconds)
3. Differentiate between a user-issued reboot and a watchdog reset
4. Create a script called platform/aspeed/install-sonic-to-emmc.sh which can be used to burn the image to eMMC from OpenBMC
Signed-off-by: Chandrasekaran Swaminathan <chander@nexthop.ai>
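Item 3 of this commit message (telling a user-issued reboot apart from a watchdog reset) could look roughly like the sketch below. It is an illustration, not the code in the PR: the function name, the cause-file argument, and the returned strings are assumptions, and treating a non-empty cause file as proof of a user-issued reboot is a guess at the intent. Only the WDIOF_CARDRESET value comes from the Linux watchdog header.

```python
# Flag from <linux/watchdog.h>: the last reset was triggered by the watchdog card.
WDIOF_CARDRESET = 0x0020

# Stand-ins for the platform API reboot-cause constants (assumed values).
REBOOT_CAUSE_WATCHDOG = "Watchdog"
REBOOT_CAUSE_NON_HARDWARE = "Non-Hardware"


def classify_reboot(bootstatus, software_cause_path):
    """Classify a reboot from the watchdog boot status and a software cause file.

    Assumption: a user-issued reboot leaves a non-empty cause file behind.
    If the card reports a reset and no such file exists, blame the watchdog.
    """
    if not bootstatus & WDIOF_CARDRESET:
        return REBOOT_CAUSE_NON_HARDWARE
    try:
        with open(software_cause_path) as f:
            if f.read().strip():
                return REBOOT_CAUSE_NON_HARDWARE
    except OSError:
        # Missing or unreadable file: no software cause was recorded.
        pass
    return REBOOT_CAUSE_WATCHDOG
```

The cause file is passed in as an argument so the logic can be exercised off-target with a temporary file.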
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
yxieca left a comment
Code Review
This PR is a cleanup/refactor for the Aspeed BMC platform in SONiC. Here's a breakdown:
Changes Summary
1. Platform services → proper Debian package ✅ Good
- Moves scripts/services from ad-hoc build_debian.sh installation into a proper aspeed-platform-services Debian package (.mk, debian/ scaffolding, rules.mk, one-image.mk)
- This is the right pattern: proper packaging vs. raw file copies in build scripts
2. New watchdog keepalive daemon (watchdog-keepalive.sh)
- 180s timeout, 60s keepalive interval
- Custom log rotation (5 files max)
- Uses watchdogutil arm/disarm for setup/teardown
3. Watchdog platform API refactor (both AST2700 EVB and NextHop)
- Removes self.armed tracking in favor of sysfs state (is_armed() reads sysfs state)
- Removes _close_watchdog() (magic 'V' close) and replaces it with _disablewatchdog() using ioctl WDIOS_DISABLECARD
- Adds proper ioctl methods: _enablewatchdog, _keepalive, _settimeout, _gettimeout, _gettimeleft
- arm() now caps at 300s max. Is this intentional? Some use cases may need longer timeouts
4. Reboot cause logic improvement (both chassis.py files)
- WDIOF_CARDRESET now checks for a software reboot cause file before defaulting to watchdog
- Previously always returned NON_HARDWARE, now returns WATCHDOG when no software cause is found. Good fix
5. Removed BMC component from platform.json (both EVB and NextHop)
- Removes "BMC" from the components list. Why? Is BMC FW update not supported via fwutil?
6. Removed obmc-console configs (both EVB and NextHop)
- Deleted server.tty.conf files and related install logic. Are these now managed elsewhere (OpenBMC side)?
7. USB network init fix (NextHop)
- Hardcodes UDC name to 12021000.usb-vhub:p1 instead of auto-detecting the first available one. Could this break on different AST2700 boards?
8. New eMMC installer script (install-sonic-to-emmc.sh)
- Installs SONiC from OpenBMC to eMMC with U-Boot env setup
- Has commented-out alternatives and hardcoded BOOTCONF="nexthop-b27-r0". Should these be configurable rather than requiring source edits?
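For readers unfamiliar with the ioctl path mentioned in item 3 above, here is a minimal sketch of how the watchdog requests can be issued from Python. The helper names are illustrative and are not the methods in the PR; the request numbers follow the asm-generic ioctl encoding used on arm, arm64 and x86, and the constants mirror the Linux watchdog header. The 300s cap is not modelled here because the summary does not say whether arm() clamps or rejects larger values.

```python
import fcntl
import struct

# Direction bits and field layout from asm-generic/ioctl.h
# (some architectures use a different layout).
_IOC_WRITE, _IOC_READ = 1, 2
_INT_SIZE = struct.calcsize("i")


def _ioc(direction, nr, size=_INT_SIZE, magic=ord("W")):
    """Build a watchdog ioctl request number (magic 'W')."""
    return (direction << 30) | (size << 16) | (magic << 8) | nr


WDIOC_SETOPTIONS = _ioc(_IOC_READ, 4)
WDIOC_KEEPALIVE = _ioc(_IOC_READ, 5)
WDIOC_SETTIMEOUT = _ioc(_IOC_READ | _IOC_WRITE, 6)
WDIOC_GETTIMEOUT = _ioc(_IOC_READ, 7)
WDIOC_GETTIMELEFT = _ioc(_IOC_READ, 10)
WDIOS_DISABLECARD = 0x0001
WDIOS_ENABLECARD = 0x0002


def _int_ioctl(fd, request, value=0):
    """Pass one C int through ioctl and return what the driver wrote back."""
    buf = struct.pack("i", value)
    return struct.unpack("i", fcntl.ioctl(fd, request, buf))[0]


def disable_watchdog(fd):
    _int_ioctl(fd, WDIOC_SETOPTIONS, WDIOS_DISABLECARD)


def enable_watchdog(fd):
    _int_ioctl(fd, WDIOC_SETOPTIONS, WDIOS_ENABLECARD)


def keepalive(fd):
    _int_ioctl(fd, WDIOC_KEEPALIVE)


def set_timeout(fd, seconds):
    """Request a timeout; the driver returns the value it actually applied."""
    return _int_ioctl(fd, WDIOC_SETTIMEOUT, seconds)
```

Here fd would be a file descriptor for the watchdog device (for example from os.open on /dev/watchdog); opening the device typically starts the timer, so try this only on a board where a reset is acceptable.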
Issues
1. Duplicate code across EVB and NextHop
- chassis.py reboot cause changes are identical in both ast2700/ and nexthop/
- common/watchdog.py changes are identical in both
- Should this be a shared base class?
2. Commented-out code left in
- build-emmc-image-installer.sh: #EMMC_SIZE_MB=7168
- install-sonic-to-emmc.sh: #BOOTCONF="ast2700-evb"
- These should be removed or made configurable
3. watchdog-keepalive.sh logs every 60s
- echo "Watchdog keepalive sent" on every iteration generates ~1440 log lines/day. Consider removing it or making it periodic (every Nth keepalive)
4. Deleted ast2700/setup.py — is the platform wheel now built differently, or was this unused?
5. PR title is vague — "Address pending review comments and some cleanup" doesn't convey the scope. This has significant changes (new Debian package, watchdog refactor, reboot cause fix, eMMC installer).
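The "every Nth keepalive" suggestion in issue 3 can be sketched as below. The actual daemon in the PR is a shell script; this Python version only illustrates the counting logic. The 180s and 60s values come from the PR description, while LOG_EVERY and the function signature are assumptions chosen for the example.

```python
import time

TIMEOUT_S = 180     # watchdog timeout from the PR description
KEEPALIVE_S = 60    # keepalive interval from the PR description
LOG_EVERY = 60      # assumption: log once per 60 keepalives (about hourly)


def keepalive_loop(pet, log=print, sleep=time.sleep, iterations=None):
    """Pet the watchdog every KEEPALIVE_S seconds, logging every LOG_EVERY-th pet.

    `pet` is any callable that sends one keepalive. `iterations` bounds the
    loop for testing; None means run forever.
    """
    count = 0
    while iterations is None or count < iterations:
        pet()
        count += 1
        if count % LOG_EVERY == 0:
            log(f"Watchdog keepalive sent ({count} total)")
        sleep(KEEPALIVE_S)
    return count


# Log volume per day: one line per keepalive vs. one line per LOG_EVERY keepalives.
LINES_PER_DAY_NOW = 24 * 60 * 60 // KEEPALIVE_S
LINES_PER_DAY_SKETCH = LINES_PER_DAY_NOW // LOG_EVERY
```

With these values the log drops from 1440 lines per day to 24, and the keepalive interval still leaves two missed pets of margin before the 180s timeout expires.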
Judy pointed out in the code review that this platform is itself a BMC, so it makes little sense to have a BMC component within it. That made sense, so I removed it.
We no longer use obmc-console; we have moved to SONiC's native consutil service, hence the removal.
This is under the Nexthop platform module and is specific to our card. Will not affect the rest.
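If portability across boards ever becomes a concern, one option is to prefer the known UDC name and fall back to whatever controller is present. The sketch below illustrates that idea in Python (the PR's script is shell); pick_udc and its parameters are hypothetical names, and /sys/class/udc is the standard sysfs location for USB device controllers.

```python
import os


def pick_udc(preferred=None, udc_dir="/sys/class/udc"):
    """Return the UDC name to bind a USB gadget to.

    Use `preferred` when that controller exists; otherwise fall back to the
    first entry (sorted) under `udc_dir`. Returns None when no UDC is present.
    """
    try:
        available = sorted(os.listdir(udc_dir))
    except OSError:
        return None
    if preferred in available:
        return preferred
    return available[0] if available else None
```

For example, pick_udc("12021000.usb-vhub:p1") keeps today's behaviour on the Nexthop card while still returning a usable name on a board that exposes a different controller.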
Fair point. As these are helper scripts (not part of the image build), I didn't make them configurable. If we really need that, I can do so. Do let me know.
If we make this a shared base class, we will need a new intermediate Python wheel package. Also, chassis.py will change for the Nexthop card to accommodate EEPROM and other features. As of today they are the same, but they are expected to diverge.
The eval board ships with an 8G eMMC, whereas the Nexthop card will ship with a 32G one (but 10G due to pSLC). To cater
Thanks for pointing that out, will fix.
Yes, it was unused.
Will fix the title.
[Service]
Type=simple
ExecStart=/usr/bin/watchdog-keepalive.sh
Do we need an infra change to copy the watchdog script from platform/aspeed/aspeed-platform-services/scripts/watchdog-keepalive.sh to /usr/bin? I see that some lines related to this were removed from build_debian.sh.
No, we now have a separate Debian package, aspeed-platform-services. Its install rules take care of this. Earlier we were doing it directly from build_debian.sh with a platform check. That was a stopgap.
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
…mbers
This commit adds support for Nexthop Aspeed BMC platforms to
address chassis identification and USB network interface naming
requirements.
1. Chassis DB Initialization (Dual Serial Numbers)
- Add custom nexthop_bmc_chassis_db_init script to populate both:
* BMC serial number from BMC EEPROM (i2c-4)
* Switch-Host serial number from switch card EEPROM (i2c-10)
- Set skip_chassis_db_init=true in pmon_daemon_control.json
- Create platform-specific supervisor config for pmon container
- Modify docker_init.j2 to copy Aspeed platform supervisor configs
- Enable syseepromd daemon for EEPROM information population
2. EEPROM Platform API Implementation
- Add eeprom.py: BMC EEPROM handler with system serial reading
- Add eeprom_utils.py: Nexthop-specific TLV decoder with custom
Vendor Extension fields (IANA 63074)
- Update chassis.py to use EEPROM API for serial number retrieval
3. USB Network Interface Naming
- Add 70-usb-network.rules to rename usb0 → bmc0
- Add sonic-usb-network-udev-init.sh to install udev rules
- Add sonic-usb-network-udev-init.service (runs before usb-network-init)
- Update usb-network-init.sh to expect bmc0 interface name
- Enable udev service in debian postinst
4. Watchdog Improvements
- Simplify get_time_left() to return -1 (remove unsupported ioctl)
Testing:
- Verified on Nexthop BMC Card
- show version displays both serial numbers
- USB interface named bmc0
- All pmon daemons start successfully
Signed-off-by: Chandrasekaran Swaminathan <chander@nexthop.ai>
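To make the EEPROM part of this commit message concrete, the sketch below walks an ONIE TlvInfo image and pulls out the serial number plus any Vendor Extension entries tagged with the IANA number quoted above. It is not the eeprom_utils.py from the PR: the function names are illustrative, the layout of the Nexthop fields inside the extension is not described in this conversation so the payload is returned as raw bytes, and CRC validation is omitted for brevity.

```python
import struct

TLV_SERIAL_NUMBER = 0x23   # ONIE TlvInfo type code for Serial Number
TLV_VENDOR_EXT = 0xFD      # ONIE TlvInfo type code for Vendor Extension
NEXTHOP_IANA = 63074       # enterprise number quoted in the commit message


def parse_tlvinfo(blob):
    """Yield (type, value) pairs from an ONIE TlvInfo EEPROM image."""
    if blob[:8] != b"TlvInfo\x00":
        raise ValueError("missing TlvInfo header")
    # Header: 8-byte ID string, 1-byte version, 2-byte big-endian total length.
    _version, total_len = struct.unpack(">BH", blob[8:11])
    pos, end = 11, 11 + total_len
    while pos + 2 <= end:
        tlv_type, length = blob[pos], blob[pos + 1]
        yield tlv_type, blob[pos + 2:pos + 2 + length]
        pos += 2 + length


def decode(blob):
    """Return (serial, payloads) where payloads are NEXTHOP_IANA vendor extensions."""
    serial, vendor = None, []
    for tlv_type, value in parse_tlvinfo(blob):
        if tlv_type == TLV_SERIAL_NUMBER:
            serial = value.decode("ascii")
        elif tlv_type == TLV_VENDOR_EXT and len(value) >= 4:
            # A vendor extension starts with a 4-byte big-endian IANA number.
            (iana,) = struct.unpack(">I", value[:4])
            if iana == NEXTHOP_IANA:
                vendor.append(value[4:])
    return serial, vendor
```

The same decode call would apply to both EEPROMs mentioned in item 1; only the source of the bytes (i2c-4 or i2c-10) differs.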
b52abd6 to b216b82
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
itself as type MODULE_TYPE_SWITCH_HOST. This is added to the collection of modules in the chassis object of the BMC and implements the methods set_admin_state, do_power_cycle, get_oper_status, etc., as outlined in the HLD sonic-net/SONiC#2215.
Change switch_cpu_utils.sh to use this so that these methods can be tested manually even before we integrate with bmcctld.
As required by the HLD, introduce a bmc.json file that contains the IPv4 address to be used for the USB network channel between the BMC and the switch CPU. Change usb-network-init.sh to program the IP address contained therein on the bmc0 interface.
Signed-off-by: Chandrasekaran Swaminathan <chander@nexthop.ai>
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
Introduce an attribute (default false) in ChassisBase to represent a BMC. Override it to true in the BMC's Chassis class implementation. Use this to selectively populate the switch host serial attribute in the Chassis DB and print it in show version.
While at it, move get_serial to SwitchHostModule and expose a new method, get_switch_host_serial, which invokes SwitchHostModule's get_serial method.
Signed-off-by: Chandrasekaran Swaminathan <chander@nexthop.ai>
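The shape of this change might look like the sketch below. All classes here are stand-ins written for illustration, not the real sonic_platform_base code, and the attribute name is_bmc, the class name BmcChassis, and the dictionary keys are assumptions; only get_serial, get_switch_host_serial and SwitchHostModule are named in the commit message.

```python
class ChassisBase:
    """Stand-in for the platform base class."""

    is_bmc = False  # hypothetical attribute name; defaults to false

    def get_serial(self):
        raise NotImplementedError


class SwitchHostModule:
    """Stand-in for the module representing the switch host managed by the BMC."""

    def __init__(self, serial):
        self._serial = serial

    def get_serial(self):
        return self._serial


class BmcChassis(ChassisBase):
    is_bmc = True  # overridden in the BMC's chassis implementation

    def __init__(self, bmc_serial, switch_host):
        self._serial = bmc_serial
        self._switch_host = switch_host

    def get_serial(self):
        return self._serial

    def get_switch_host_serial(self):
        # Delegate to the module, as the commit message describes.
        return self._switch_host.get_serial()


def chassis_db_fields(chassis):
    """Build serial fields, adding the switch-host serial only on a BMC."""
    fields = {"serial": chassis.get_serial()}
    if chassis.is_bmc:
        fields["switch_host_serial"] = chassis.get_switch_host_serial()
    return fields
```

Callers such as the Chassis DB init or show version would branch on the flag rather than on the platform name, so non-BMC platforms are unaffected.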
/azp run Azure.sonic-buildimage
Azure Pipelines successfully started running 1 pipeline(s).
files
Why I did it
Changes based on review comments received and to cater to the platform HLD sonic-net/SONiC#2215